EXECUTIVE SUMMARY


The  National  Security  Agency  (NSA)  conducts  research  and  development  to  meet 
the  needs  of  the  United  States  for  signals  intelligence  and  communications  security. 
Current  NSA  research  includes  development  of  a  General  Communications  Assessment 
Tool  (GCAT)  to  model  and  analyze  public  switched  telephone  networks  (PSTNs).  The 
GCAT  design  allows  modeling  of  many  PSTN  types,  where  each  type  follows  a  specific 
switching  protocol  to  route  calls.  A  PSTN  type  used  widely  throughout  the  world  is  a 
hierarchical  PSTN.  In  order  to  predict  call  routing  in  a  hierarchical  PSTN,  its 
hierarchical  structure  must  be  determined  (the  network  must  be  classified).  A  GCAT 
node  classifier  assigns  each  switch  in  the  PSTN  to  a  numerical  switching  level,  observing 
a  set  of  classification  rules.  After  all  switches  are  assigned  to  switching  levels,  GCAT 
predicts  communications  traffic  flow  through  the  PSTN  and  analyzes  PSTN  performance. 

Existing  GCAT  artificial  intelligence  software  for  node  classification  has  limited 
ability  to  classify  switches  in  hierarchical  PSTNs.  This  thesis  develops  and  tests  a  fast, 
robust  algorithm,  called  the  Top  Down  Node  Classification  Algorithm  (TDNCA),  to 
classify  switches  in  hierarchical  PSTNs.  An  abstraction  of  a  PSTN  is  a  connected 
network  whose  nodes  represent  switches,  and  whose  arcs  represent  connections  between 
switches.  Given  this  abstraction,  TDNCA  applies  graph-theoretic  techniques  to  infer  the 
hierarchical  switching  levels  of  the  network.  The  primary  objective  is  to  find  a 
classification  with  the  fewest  number  of  hierarchical  switching  levels,  because  real-world 
PSTNs  are  constructed  in  this  manner.  We  develop  bounds  for  the  minimum  number  of 
levels,  and  implicitly  enumerate  all  possible  classifications  for  each  network. 

TDNCA  observes  several  classification  rules  that  reflect  the  engineering  design  of 
hierarchical  PSTNs.  In  some  real-world  PSTNs,  the  actual  configuration  differs  from  the 
standard  hierarchical  design.  TDNCA  has  the  ability  to  use  soft  inferences  to  more 
accurately  classify  a  PSTN.  A  soft  inference  is  the  probability  that  a  switch  occupies  a 
certain  hierarchical  level,  based  on  engineering  characteristics  of  installed  PSTN 
switches. 




This  thesis  explores  three  different  ranking  criteria  for  PSTN  classifications.  One 
ranking  criterion  assumes  no  soft  inferences  exist.  The  remaining  criteria  count  the 
number  of  soft  inferences  that  are  satisfied  and  apply  a  quadratic  penalty  to  soft 
inferences  that  are  not  satisfied.  We  compare  the  results  from  the  three  ranking  criteria 
and  make  recommendations  for  further  research. 

TDNCA  is  implemented  in  Java  and  sample  PSTNs  are  classified  using  a  personal 
computer.  Solutions  are  obtained  in  under  one  second  for  actual  PSTNs.  Large  notional 
networks of over 300 nodes and 900 arcs, developed to test specific aspects of the
algorithm,  are  classified  in  under  one  minute.  TDNCA  is  faster  than  existing  GCAT 
software  and  can  be  easily  re-coded  into  C  or  C++  and  integrated  into  GCAT. 




ACKNOWLEDGMENT 


I  am  sincerely  grateful  for  the  assistance  and  guidance  of  Prof.  Rob  Dell,  who 
made  key  suggestions  that  led  to  the  development  of  the  algorithm  presented  in  this 
thesis.  He  was  always  willing  to  listen  to  me  bounce  ideas  off  him  (string  and  styrofoam 
models  on  his  office  floor,  among  others),  usually  steering  me  back  into  thesis  reality.  In 
particular,  he  provided  valuable  instruction  on  how  to  write  concisely  and  actively. 

Additional  thanks  to  Dr.  Norm  Curet,  who  first  presented  this  problem  during  my 
six  week  experience  tour  in  the  NSA  OR  shop.  He  made  the  experience  tour  worthwhile 
and  productive,  providing  interesting  exposure  to  real  problems  and  an  environment 
favorable  to  solving  them. 

Prof.  Kevin  Wood  taught  me  network  theory  well  in  his  class  and  provided  a 
starting  point  for  algorithm  2  during  an  office  discussion.  Prof.  Jerry  Brown  provided 
thoughtful  comments  which  greatly  improved  the  quality  of  this  thesis. 

I greatly appreciate the assistance of Major Al Olson, USMC, whose many
discussions  helped  crystallize  some  of  my  ideas.  He  also  provided  many  of  the  figures 
that  appear  in  this  thesis. 






I.  INTRODUCTION 


A.  PROBLEM  DESCRIPTION 

Executive  Order  12333  (1981)  tasks  the  National  Security  Agency  (NSA)  with 
“collection  and  processing  of  signals  intelligence  (SIGINT)  information  for  national 
foreign  intelligence  purposes,”  “executing  the  responsibilities  of  the  Secretary  of 
Defense  as  executive  agent  for  the  communications  security  (INFOSEC)  of  the  United 
States  Government”  and  conducting  “research  and  development  to  meet  the  needs  of  the 
United  States  for  signals  intelligence  and  communications  security.” 

NSA  studies  United  States  and  foreign  communications  infrastructure  for 
purposes  of  exploitation  and  protection.  Current  NSA  research  includes  development  of 
a  General  Communications  Assessment  Tool  (GCAT)  to  model  and  analyze  public 
switched  telephone  networks  (PSTNs).  GCAT  (Figure  1)  has  several  major  functions, 
including  modeling  arbitrary  inter-connected  and  hierarchical  PSTN  topologies, 
predicting  communication  routes  through  PSTNs,  and  analyzing  PSTN  performance. 

Figure 1: Schematic of General Communications Assessment Tool (GCAT)


GCAT  develops  a  network  (a  set  of  nodes  connected  by  undirected,  unit-length 
arcs)  representative  of  a  PSTN  (a  set  of  switches  or  exchanges,  connected  by  trunk  lines 
or  links).  This  thesis  uses  the  term  PSTN  when  referring  to  an  actual  telecommunications 
switching  structure  and  just  the  term  network  when  referring  to  the  modeled  abstraction. 
A GCAT node classifier assigns each switch in the PSTN to a numerical switching level,
observing a set of classification rules. After all switches are assigned to switching levels,
the  route  builder  predicts  communications  traffic  flow  through  the  PSTN  and  analyzes 
PSTN  performance. 

GCAT  design  allows  modeling  of  many  PSTN  types,  where  each  PSTN  type 
follows  a  specific  switching  protocol  to  route  calls  through  the  PSTN.  A  PSTN  type  used 
widely  throughout  the  world  is  a  hierarchical  PSTN.  Existing  GCAT  artificial 
intelligence  (AI)  software  for  node  classification  has  limited  ability  to  classify  switches  in 
hierarchical PSTNs. This thesis develops and tests a fast, robust algorithm, the Top Down
Node Classification Algorithm (TDNCA), to classify switches in hierarchical PSTNs.


B.  PUBLIC  SWITCHED  TELEPHONE  NETWORKS 

1.  Description  and  Terminology 

PSTNs  route  telephone  calls  worldwide.  Individual  telephones  connect  to 
regional  PSTNs  through  local  loops.  A  regional  hierarchical  PSTN  is  sized  to  serve  a 
regional  area  (e.g.,  Washington,  DC).  Regional  PSTNs  tie  together  with  sophisticated, 
high  capacity  switches  to  form  a  global  PSTN.  When  regional  PSTNs  combine, 
additional  switching  levels  route  traffic  between  the  regional  PSTNs.  A  call  from  one 
telephone  to  another  routes  through  a  local  loop  to  the  PSTN,  then  through  the  PSTN  to 
another  local  loop  which  connects  to  the  receiving  telephone.  The  connection  between 
the  two  telephones  is  established  on  demand  only.  When  the  call  completes,  the  PSTN 
releases  that  connection  and  regains  that  calling  capacity. 

Figure  2  illustrates  a  simple  hierarchical  PSTN  with  three  levels.  Each  exchange 
(or switch) in a regional hierarchical PSTN occupies one of four possible integer
switching  levels,  also  referred  to  as  classes,  ranging  in  value  from  three  to  six.  The 
highest  switching  level  has  the  lowest  class  number.  A  switch  with  class  number  three  or 
four is a large, regional exchange known as a transit exchange or tandem. Transit
exchanges are more sophisticated and more costly than switches with less capability, but have
greater ability to route calls over long distances (Ash 1998).

Figure 2: A simple example of a hierarchical PSTN with three
levels.  Letters  in  the  circles  are  switch  names.  Numbers  outside 
the  circles  are  switch  classes. 

An  exchange  with  class  number  five  is  an  end  office,  a  smaller  and  simpler  switch 
than  a  transit  exchange.  Most  exchanges  in  hierarchical  PSTNs  are  end  offices.  An 
exchange  with  class  number  six  is  a  remote,  the  simplest  switch  in  a  PSTN.  Remotes  and 
end  offices  are  collectively  known  as  local  exchanges. 

2.  Logical  and  Physical  Connections 

Physical  links  (copper  wires,  fiber  optic  cables  or  microwave  links),  also  known 
as  trunk  lines,  connect  PSTN  switches.  Actual  communications  between  switches  occur 
along  logical  connections,  defined  by  PSTN  communications  software.  Physical 
connections  can  be  very  different  from  logical  connections,  as  illustrated  in  Figure  3. 

The  GCAT  node  classifier  uses  logical  connections. 

Figure 3: A simple network illustrating the difference between physical and logical
connections.  Communications  between  nodes  occur  along  logical  connections, 
which  may  traverse  several  physical  connections.  In  the  two  diagrams,  the  nodes 
are  identical,  but  the  arcs  representing  connections  are  different. 


3.  Hierarchical  Switching  in  a  PSTN 

It  is  economically  infeasible  to  construct  PSTNs  with  sufficient  capacity  to 
connect  all  potential  users  simultaneously.  To  maximize  the  number  of  simultaneous 
connections  supported,  calls  in  PSTNs  are  routed  on  the  most  direct  path  available.  This 
consumes  the  least  switching  capacity  and  increases  the  probability  that  a  subsequent  call 
connects.  Fixed  hierarchical  routing  (a  characteristic  of  hierarchical  PSTNs)  is  the  most 
common  architecture  in  use  worldwide.  This  simple  strategy  was  necessary  several 
decades  ago,  when  most  PSTNs  were  constructed,  because  reliable  and  sophisticated 
switches  were  not  available.  Additionally,  fixed  hierarchical  routing  prevents  switching 
calls  back  on  themselves  or  using  an  excessive  number  of  links  on  a  call  (Ash  1998). 
With  fixed  hierarchical  routing,  each  switch  must  know  only  which  switches  it  can 
communicate  with  directly  and  a  priority  order  for  connections.  Each  switch  has  exactly 
one  parent  (or  final)  switch  (an  adjacent  switch  closer  to  the  top  level  of  the  hierarchy) 
unless  the  switch  is  itself  at  the  top  level.  The  parent  switch  is  the  last  switch  with  which 
a  lower  level  switch  attempts  to  communicate.  After  exhausting  all  other  switching 
possibilities,  a  switch  homes  on  its  parent  switch.  Figure  4  illustrates  fixed  hierarchical 
routing  in  a  three-level  PSTN. 

Network: Switch names are inside the circles. Numbers outside the circles indicate the class of each switch. The arrow indicates the direction of homing for switch A.

Route 1: The direct route (path A-B). This is the preferred route between A and B, using a direct link.

Route 2: If no direct link is available between A and B or if the link is busy, the call routes through switch D (the parent of B), then to B (path A-D-B).

Route 3: If the connection between A and D is busy or no direct link exists between A and D, the call routes through switch C (the parent of A), then to B (path A-C-B). Switch A has homed on its parent, switch C.

Route 4: If the connection between C and B is busy or no direct link exists between C and B, the call routes through both of the class five switches (path A-C-D-B).

Route 5: The final route. If route 4 is unavailable, the call routes through switch E (path A-C-E-D-B). If all circuits on the final route are full, the call is not completed.

Figure  4:  Fixed  hierarchical  routing  in  a  three-level  PSTN.  Two  types  of  links  connect 
switches.  High-usage  or  direct  links  (dashed  lines)  connect  switches  that  have  sufficient  traffic 
to  make  a  direct  route  economical.  Final  links  are  the  links  between  each  switch  and  its  parent, 
together  with  the  final  links  between  all  switches  at  the  top  level  (solid  lines)  (Ash  1998). 
Hierarchical  routing  attempts  direct  routes  first,  then  overflow  calls  shift  toward  the  final  route. 
The  preferred  routes  traverse  the  fewest  links. 
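
The overflow order of Figure 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `select_route`, the representation of busy links as a set of unordered switch pairs, and the constant `FIGURE_4_ROUTES` are hypothetical and are not part of GCAT. The route list simply copies the five paths named in Figure 4.

```python
def select_route(routes, busy_links):
    """Return the first route, in priority order, none of whose links is busy.

    routes: list of paths, each a list of switch names, most preferred first.
    busy_links: set of frozensets {u, v} naming links with no free circuit
                (a hypothetical representation chosen for this sketch).
    Returns None when every route is blocked (the call is not completed).
    """
    for path in routes:
        links = [frozenset(pair) for pair in zip(path, path[1:])]
        if not any(link in busy_links for link in links):
            return path
    return None


# The five routes from A to B in Figure 4: direct route first, final route last.
FIGURE_4_ROUTES = [
    ["A", "B"],
    ["A", "D", "B"],
    ["A", "C", "B"],
    ["A", "C", "D", "B"],
    ["A", "C", "E", "D", "B"],
]
```

With no busy links the call takes the direct route A-B. Marking the A-B and A-D links busy moves the call to A-C-B, which is the point at which switch A homes on its parent.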

Basic structures (Figure 5) can be connected to form larger PSTNs. A mesh is a
maximally-connected  group  of  nodes,  called  a  clique  in  graph  theory.  Each  node  in  a 
mesh  can  communicate  directly  with  all  other  nodes,  allowing  high  traffic  density. 
Meshes  provide  high  reliability  and  are  common  in  urban  areas.  A  hub  and  spoke  is  a 
less  expensive  alternative  to  a  mesh,  since  the  switches  use  simpler  routing.  Calls  route 
directly  using  links  connecting  adjacent  outer  nodes.  Calls  to  more  distant  nodes  route 
through  the  central  node.  In  the  star  structure,  all  calls  route  through  the  central  node. 
Stars  are  common  in  rural  areas,  where  traffic  density  is  low.  Switching  is  most 
complicated  in  a  mesh  and  least  complicated  in  a  star. 


Figure  5:  Typical  PSTN  structures.  Meshes  (maximally  connected)  are  most 
versatile  and  expensive;  stars  (minimally  connected)  are  simplest  and  least 
expensive. After Olson (1998).


Figure  6  illustrates  a  hierarchical  PSTN  containing  each  of  the  structures  shown  in 
Figure  5. 

Figure 6: Schematic diagram of a typical hierarchical PSTN with four switching
levels. Nodes A and D are the most sophisticated and highest capacity switches.
From Olson (1998).


C.  THE  CLASSIFICATION  PROBLEM 

The  class  of  each  switch  in  the  target  PSTN  is  unknown.  The  classification 
problem  is  to  “reverse  engineer”  the  target  PSTN  and  determine  the  actual  switch  classes. 
Classification  rules  (constraints)  must  be  satisfied  and  classification  objectives  must  be 
achieved. 


1.  Terminology 

An  abstraction  of  a  PSTN  is  a  connected  network  G(N,  A)  consisting  of  the  set  of 
nodes  N  and  a  set  of  unit-length,  undirected  arcs  A.  This  thesis  uses  the  additional  terms: 

1) level or class number: the number assigned to a node (integer from 3 to 6);

2) top level: the lowest class number assigned to a node;

3) top-level node: a node assigned to the top level (the highest switching level in
the network);

4) clique: a subset of nodes $N' \subseteq N$ is said to be a clique if every pair of nodes in
$N'$ is connected by an arc (Ahuja et al. 1993). A clique forms a complete
graph. The mesh structure of Figure 5 is an example of a clique; and

5) solution: an assignment of class numbers to all network nodes.

2.  Classification  Rules 

Three  hierarchical  classification  rules  or  constraints  must  always  be  satisfied: 

1)  top-level  nodes  form  a  clique; 

2)  nodes  that  are  not  top-level  nodes  have  one  parent;  and 

3)  daughter  nodes  are  immediate  descendants  of  parents  (each  daughter  is  exactly 
one  class  number  higher  than  its  parent). 
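
The three rules can be checked mechanically for any proposed solution. The following Python sketch is illustrative only: the function name `check_rules` and the dictionary-based network representation are assumptions made here, not part of GCAT or TDNCA. It treats rules 2 and 3 together by requiring every node below the top level to have at least one adjacent node exactly one class number lower. Choosing which of several such neighbors serves as the single parent is left out of the sketch.

```python
def check_rules(adj, cls):
    """Check the three hierarchical classification rules for one solution.

    adj: dict mapping each node to the set of its neighbors (undirected arcs).
    cls: dict mapping each node to its class number.
    Returns a list of violation messages; an empty list means no rule is broken.
    """
    violations = []
    top = min(cls.values())
    top_nodes = sorted(n for n in cls if cls[n] == top)

    # Rule 1: top-level nodes form a clique.
    for k, u in enumerate(top_nodes):
        for v in top_nodes[k + 1:]:
            if v not in adj[u]:
                violations.append(f"top-level nodes {u} and {v} are not linked")

    # Rules 2 and 3: every other node needs a candidate parent, that is, an
    # adjacent node whose class number is exactly one lower.
    for n in sorted(cls):
        if cls[n] == top:
            continue
        if not any(cls[m] == cls[n] - 1 for m in adj[n]):
            violations.append(
                f"node {n} has no adjacent parent at class {cls[n] - 1}")
    return violations


# A made-up three-node path X - Y - Z used only to exercise the checker.
PATH_ADJ = {"X": {"Y"}, "Y": {"X", "Z"}, "Z": {"Y"}}
```

Placing Y alone at the top level (class 4, with X and Z at class 5) passes all checks. Placing X and Z together at the top level fails rule 1, because no arc joins them.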

3.  Classification  Objectives 

A  single  primary  objective  must  always  be  achieved: 

1)  find  a  solution  with  the  fewest  number  of  levels  that  satisfies  the  classification 
rules. 

Secondary  objectives  must  be  achieved  as  specified: 

1)  if  soft  inferences  exist  (see  next  section),  they  should  be  observed  whenever 
possible,  or 

2)  in  the  absence  of  soft  inferences,  the  solution  has  a  desired  ratio  of  top-level 
nodes  to  nodes  not  at  the  top  level. 

Figure  7  shows  an  example  of  a  simple  hierarchical  PSTN  with  a  correct 
classification  and  an  incorrect  classification. 


Figure  7:  Examples  of  PSTN  classifications.  Arcs  indicate  logical  connections.  The 
correct  classification  follows  hierarchical  rules.  In  the  incorrect  classification,  top-level 
nodes  (class  4)  are  not  a  clique,  since  there  is  no  arc  from  node  C  to  node  F. 


4.  Soft  Inferences 

Soft  inferences  are  estimates  that  a  PSTN  switch  is  of  a  particular  class,  based  on 
the  engineering  characteristics  of  the  installed  equipment.  These  engineering 
characteristics  may  be  unknown,  but  if  known,  this  information  should  be  used  to  help 
classify  the  network.  In  the  existing  GCAT,  soft  inferences  derive  from  a  set  of  expert 
knowledge  rules,  based  on  equipment  characteristics.  In  the  list  that  follows,  the  rules  at 
the  top  of  the  list  are  better  indicators  of  switching  level  than  later  rules. 

1)  Common  Language  Location  Identification  (CLLI)  rule:  if  the  CLLI  code  of  a 
switch  ends  in  "T",  this  indicates  a  large  switching  capacity,  normally  associated  with  a 
transit  exchange. 

2)  NPACOC  rule:  if  the  NPACOC  code  of  a  switch  (which  indicates  the  number 
of  subscriber  loops  connected  to  the  switch)  is  “0”,  the  switch  probably  accomplishes 
trunk  routing  and  is  probably  a  transit  exchange. 

3)  Operating  Company  Name  (OCN)  rule:  if  the  OCN  of  a  switch  is  known  and  is 
not  the  most  common  OCN  of  the  network,  the  switch  is  most  likely  to  be  a  local 
exchange. When multiple companies cooperatively build a PSTN or combine existing
PSTNs, the dominant company tends to control the transit exchanges.

4) Equipment rules: certain equipment types are more likely to be associated with
certain  classes  of  switch.  Equipment  types  "4ESS,"  "DMS250,"  or  "DMS100"  are 
generally  associated  with  transit  exchanges.  Equipment  type  "DCO"  is  generally 
associated  with  local  exchanges. 
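
As a rough illustration of how the four rules might be applied in priority order, consider the Python sketch below. It is not the CLIPS rule base used by GCAT. The record field names (`clli`, `npacoc`, `ocn`, `equipment`), the function name `soft_inference`, and the policy that the first rule to fire decides the result are all assumptions made for this example. The text states only that earlier rules are better indicators than later ones.

```python
TRANSIT_EQUIPMENT = {"4ESS", "DMS250", "DMS100"}
LOCAL_EQUIPMENT = {"DCO"}


def soft_inference(switch, dominant_ocn=None):
    """Return "transit", "local", or None for one switch record.

    switch: dict with optional keys "clli", "npacoc", "ocn", "equipment"
            (hypothetical field names).
    dominant_ocn: the most common OCN of the network, if known.
    Rules are tried in the order listed in the text; the first that fires wins.
    """
    clli = switch.get("clli")
    if clli and clli.endswith("T"):                      # 1) CLLI rule
        return "transit"
    if switch.get("npacoc") == "0":                      # 2) NPACOC rule
        return "transit"
    ocn = switch.get("ocn")
    if ocn is not None and dominant_ocn is not None and ocn != dominant_ocn:
        return "local"                                   # 3) OCN rule
    equipment = switch.get("equipment")
    if equipment in TRANSIT_EQUIPMENT:                   # 4) Equipment rules
        return "transit"
    if equipment in LOCAL_EQUIPMENT:
        return "local"
    return None                                          # no inference available
```

A switch whose OCN differs from the dominant OCN is labeled a local exchange even if its equipment type is "4ESS", because the OCN rule precedes the equipment rules in the list.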


5.  Mathematical  Formulation  of  the  Classification  Problem 


A  mathematical  formulation  of  the  classification  problem  appears  below.  This 
formulation  is  provided  only  to  describe  the  problem. 


Indices

$i, j$    node $(1, 2, 3, \ldots, N)$.

Data

$LINK$    set of undirected arcs $(i, j)$ in the network.

Variables

$cl_i$    class assigned to node $i$;

$zclass$    minimum class assigned to any node;

$top_i$    1 if node $i$ is at the top level, 0 otherwise; and

$p_{i,j}$    1 if node $j$ is the parent of node $i$, 0 otherwise.

Formulation

There are two objectives, the former more important than the latter:

\begin{align}
& \text{maximize} \quad zclass \tag{1} \\
& \text{maximize} \quad f(cl_1, cl_2, \ldots, cl_N) \tag{2}
\end{align}

subject to:

\begin{align}
zclass &\le cl_i & \forall i \tag{3} \\
top_i + top_j &\le 1 & \forall (i, j) \notin LINK \tag{4} \\
top_i + \sum_{j \mid (i, j) \in LINK} p_{i,j} &= 1 & \forall i \tag{5} \\
cl_i - cl_j - 4 \cdot p_{i,j} &\ge -3 & \forall (i, j) \in LINK \tag{6} \\
p_{i,j} + p_{j,i} &\le 1 & \forall (i, j) \in LINK \tag{7} \\
cl_i - zclass + top_i &\ge 1 & \forall i \tag{8} \\
cl_i - zclass + 3 \cdot top_i &\le 3 & \forall i \tag{9} \\
p_{i,j} &\in \{0, 1\} & \forall (i, j) \in LINK \tag{10} \\
top_i &\in \{0, 1\} & \forall i \tag{11} \\
cl_i &\in \{3, 4, 5, 6\} & \forall i \tag{12}
\end{align}

The objective function (1) maximizes the class number of the top-level nodes,
which is the same as minimizing the number of levels in the network (the primary
objective). Objective function (2), an arbitrary function of the class assigned to each
node, achieves the secondary objectives. Together, these objective functions achieve
both classification objectives.

Constraints (3) determine the minimum class number used. This is a
linearization of $zclass = \min_{i \in N} \{cl_i\}$. Constraints (4) specify that two nodes cannot both be
top-level nodes if they are not directly connected by an arc. Constraints (5) require every
node not at the top level to have exactly one parent node. Constraints (6) specify that if
node j is the parent of node i, then node i must have a class number at least one higher
than node j. Constraints (7) prevent two nodes from being parents of each other.
Constraints (8) and (9) specify that if a node is not at the top level, its class number must
be at least one more than the minimum class number and a node at the top level must
have its class number equal to the minimum class number. Constraints (10) and (11) are
binary restrictions; constraints (12) restrict variable $cl_i$ to be integer.

D. OBJECTIVE OF CURRENT RESEARCH


The  existing  GCAT  node  classifier  applies  AI  in  the  form  of  expert  knowledge 
rules  to  determine  the  classification  of  network  nodes.  The  rules  are  programmed  in  the 
C-Language  Integrated  Production  System  (CLIPS),  an  expert  system  language 
developed  at  NASA's  Johnson  Space  Center  to  apply  computer  speed  to  rule-based 
decision  making  (Giarratano  and  Riley  1993).  The  AI  approach  to  node  classification  has 
several  drawbacks.  It  is  slow  to  converge  to  a  solution  and  is  not  analytically  accessible 
(it  is  essentially  a  “black  box”)  (Curet  1997). 

The  objective  of  this  research  is  to  develop  a  fast  node  classifier  for  regional 
hierarchical  PSTNs.  Inputs  to  the  node  classifier  are  the  logical  connections  between 
network nodes as undirected, unit-length arcs, and soft inferences (if they exist). The
node  classifier  outputs  a  list  of  nodes  with  assigned  class  numbers.  This  thesis  develops 
TDNCA  using  graph-theoretic  techniques  and  implements  it  in  Java. 

II. RELATED RESEARCH


Operations Research methods have contributed directly to the design and
expansion of telephone networks; however, nothing in the published literature directly
applies  to  the  node  classification  problem.  This  specialized  problem  has  been  researched 
solely  by  NSA. 

To improve GCAT node classification speed and accuracy, Curet et al. (1998)
developed  a  mixed  integer  linear  program  (MIP)  at  the  NSA  Center  for  Operations 
Research.  Olson  (1998)  continued  development  of  this  MIP  as  a  thesis  at  Naval 
Postgraduate  School  using  the  General  Algebraic  Modeling  System  (GAMS  1997)  to 
generate  the  model  and  Optimization  Subroutine  Library  (IBM  1998)  to  solve  it. 

Olson modified the formulation given in Section I.C.5 in several ways. He uses a
single objective function of the form

$$\text{maximize:} \quad ZWT \cdot zclass - TWT \cdot \sum_{i} top_i + PWT \cdot \sum_{i} bcl_{5,i} + \sum_{i} \sum_{c} SOFT_{c,i} \cdot bcl_{c,i} \qquad \text{(obj)}$$

changes the variable $cl_i$ to a binary variable,

$bcl_{c,i}$    a binary variable which is 1 if node i’s class is c, and is 0 otherwise

and adds the following parameters as weights for the objective function,

$ZWT$    objective coefficient weight for minimum node class used;

$TWT$    objective coefficient weight for each top node;

$PWT$    objective coefficient weight for nodes that are not the smallest
possible class given a choice of two classes; and

$SOFT_{c,i}$    soft inference parameter; an objective function weight applied to
influence the class c assigned to node i in the final solution.

Objective function (obj) restates lexicographic objectives (1) and (2) as a single
weighted  expression.  The  primary  objective  minimizes  the  number  of  levels  in  the 
classified  network.  This  is  accomplished  in  (obj)  by  the  constant,  ZWT  (typically  100), 
multiplied  by  the  lowest  class  number  in  the  network,  zclass.  The  secondary 
classification  objective  shapes  the  network  solution,  accomplished  in  (obj)  by  the 
remaining  terms.  A  constant,  TWT  (typically  about  0.5),  multiplied  by  the  number  of  top- 
level  nodes,  minimizes  the  number  of  nodes  at  the  top  level.  A  constant,  PWT  (typically 
about  0.1),  encourages  most  nodes  to  be  at  level  five.  The  final  term  allows  soft 
inferences  to  contribute  to  the  shape  of  the  solution. 

The  values  of  the  objective  function  weights  determine  the  relative  importance  of 
the  competing  classification  objectives.  ZWT  is  usually  much  larger  than  either  TWT  or 
PWT,  making  the  primary  classification  objective  the  most  important.  Given  a  set  of  top- 
level  nodes,  the  ratio  TWT/PWT  controls  the  shape  of  the  rest  of  the  network. 
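
To make the role of the weights concrete, the Python sketch below evaluates the weighted objective for a given solution: ZWT times the lowest class number, minus TWT times the number of top-level nodes, plus PWT times the number of class-five nodes, plus the soft inference weights of the classes actually assigned. The function name `weighted_objective` and the dictionary representation of SOFT are assumptions made here. The default weights are the typical values quoted above. This is not Olson's GAMS model, only an evaluation of its objective for a fixed assignment.

```python
def weighted_objective(cls, soft=None, zwt=100.0, twt=0.5, pwt=0.1):
    """Evaluate the weighted objective (obj) for one solution.

    cls:  dict mapping node -> assigned class number.
    soft: dict mapping (class, node) -> SOFT weight; missing pairs count as 0.
    """
    soft = soft or {}
    zclass = min(cls.values())                       # lowest class number used
    n_top = sum(1 for c in cls.values() if c == zclass)
    n_class5 = sum(1 for c in cls.values() if c == 5)
    # Only the weight for the class actually assigned to a node contributes.
    soft_term = sum(soft.get((c, n), 0.0) for n, c in cls.items())
    return zwt * zclass - twt * n_top + pwt * n_class5 + soft_term


# A made-up four-node solution used only to exercise the function.
EXAMPLE = {"a": 4, "b": 5, "c": 5, "d": 6}
```

Because ZWT is much larger than the other weights, any solution that raises zclass by one outscores every solution with the lower zclass unless the soft inference weights are very large. This is how the single expression preserves the priority of the primary objective.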

In the mathematical formulation of Section I.C.5, constraints (4), ensuring that
top-level nodes are a clique, are “weak” constraints. These constraints do not directly
check that the top-level nodes are a clique, but rather ensure that, for every pair of nodes
that are not linked, at most one node of the pair is at the top level. This appears to
cause several of the binary and integer variables to take on fractional values in the linear
programming  relaxation.  Olson  found  that  analysis  of  the  network  topology  identifies 
additional  constraints  that  tighten  the  formulation  of  the  MIP  and  improve  solution  time. 
TDNCA  uses  similar  but  more  extensive  graph-theoretic  techniques. 

III. TOP-DOWN NODE CLASSIFICATION ALGORITHM


A.  TERMINOLOGY 

An  abstraction  of  a  PSTN  is  a  connected  network  G(N,  A)  consisting  of  the  set  of 
nodes  N  and  a  set  of  unit-length,  undirected  arcs  A.  This  thesis  uses  the  additional  terms: 


(1) $SP_{ij}$ = the shortest path distance from node i to node j, $\forall\, i, j \in N$;

(2) $mSP_i = \max_{j \in N} \{SP_{ij}\}$, the maximum shortest path distance from node i to any
other node in the network;

(3) $maxSP = \max_{i \in N} \{mSP_i\}$, the maximum shortest path distance between any two
nodes in the network;

(4) $minSP = \min_{i \in N} \{mSP_i\}$, the minimum over all nodes i of the maximum shortest path
distance $mSP_i$; Property 4, below, relates minSP to maxSP;


(5)  feasible  solution  =  a  solution  that  satisfies  all  classification  rules; 

(6)  acceptable  solution  =  a  feasible  solution  that  achieves  the  primary 
classification  objective; 

(7)  solution  pool  =  the  set  of  all  acceptable  solutions;  and 

(8)  optimal  solution  =  a  solution  from  the  solution  pool  that  best  achieves  the 
secondary  classification  objectives. 
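
Because all arcs have unit length, the quantities in items (1) through (4) can be computed with breadth-first search from every node. The Python sketch below is an illustration, not the thesis implementation (which is in Java). The function names `bfs_distances` and `path_statistics` and the adjacency-dictionary input are assumptions made here.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest path distances SP from source to every node (unit-length arcs)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def path_statistics(adj):
    """Return (mSP, maxSP, minSP) for a connected network.

    mSP maps each node i to its maximum shortest path distance to any other
    node; maxSP and minSP are the largest and smallest of those values.
    """
    msp = {i: max(bfs_distances(adj, i).values()) for i in adj}
    return msp, max(msp.values()), min(msp.values())


# A made-up five-node path a - b - c - d - e used only as an example.
PATH5 = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"},
         "d": {"c", "e"}, "e": {"d"}}
```

For the five-node path, the end nodes have mSP of 4 and the middle node has mSP of 2, so maxSP is 4 and minSP is 2.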


B.  INTRODUCTION 


TDNCA  classifies  network  nodes  using  a  “top  down”  approach,  finding  all 
combinations  of  potential  top-level  nodes  that  produce  acceptable  solutions.  Simple 
addition  determines  the  solution  for  each  set  of  potential  top-level  nodes.  TDNCA 
selects the solution which most favorably achieves the secondary objectives as the
optimal solution. To illustrate the principles of TDNCA, refer to the small example
network  of  Figure  8.  Consider  two  cases  of  choosing  a  single  top-level  node. 

In  the  first  case,  classify  node  D  as  a  top-level  node,  assigned  class  number  four. 
Then,  assign  nodes  C,  E,  F  and  G  class  number  five,  since  they  are  adjacent  to  node  D. 
Assign  the  remaining  nodes,  separated  from  node  D  by  two  arcs,  class  number  six. 


Figure  8:  Example  network  showing  two  possible  classifications  with  a  single  node  at  the 
top  level.  Assigning  node  D  to  the  top  level  achieves  the  primary  objective. 


In  the  second  case,  classify  node  G  as  a  top-level  node,  assigned  class  number 
three.  Assign  nodes  D  and  H,  adjacent  to  node  G,  class  number  four.  Assign  nodes  C,  E, 
and  F,  separated  by  two  arcs  from  node  G,  class  number  five.  Assign  the  remaining 
nodes,  separated  by  three  arcs  from  node  G,  class  number  six.  Note  that  node  G  must  be 
assigned  class  number  three  if  it  is  the  only  top-level  node  so  that  all  node  class  numbers 
are  six  or  lower. 

Of  these  two  cases,  selection  of  node  D  as  a  single  top-level  node  is  preferable, 
since this achieves the primary classification objective: a solution with the minimum
number of levels (three). If node G is selected as the single top-level node, the solution
has  four  levels.  Note  that  a  solution  follows  quickly  by  simple  addition  after  identifying 
top-level  nodes.  In  this  simple  example,  node  D  is  the  only  single  node  that  can  be  at  the 
top  level  and  result  in  a  solution  with  three  levels. 
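
The "simple addition" step can be sketched in Python as a breadth-first search outward from the chosen top-level nodes. The sketch is illustrative and makes two assumptions. First, the arc list `FIGURE_8_ARCS` is reconstructed from the distances quoted in the text, and the actual figure may contain arcs not listed here. Second, the function names are hypothetical and are not taken from the thesis code.

```python
from collections import deque

# Arcs assumed for the Figure 8 network, inferred from the text's description.
FIGURE_8_ARCS = [("A", "C"), ("B", "C"), ("C", "D"), ("D", "E"),
                 ("D", "F"), ("D", "G"), ("G", "H")]


def build_adjacency(arcs):
    """Build a node -> set-of-neighbors dict from undirected arcs."""
    adj = {}
    for u, v in arcs:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def classify_from_top(adj, top_nodes, bottom_class=6):
    """Assign class numbers given top-level nodes (assumed to form a clique)."""
    depth = {n: 0 for n in top_nodes}
    queue = deque(top_nodes)
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in depth:
                depth[v] = depth[u] + 1
                queue.append(v)
    deepest = max(depth.values())
    # Nodes farthest from the top receive bottom_class; each arc closer to
    # the top lowers the class number by one.
    return {n: bottom_class - (deepest - d) for n, d in depth.items()}
```

Under the assumed arcs, `classify_from_top(adj, ["D"])` places D at class four, and `classify_from_top(adj, ["G"])` forces G to class three, matching the two cases described above.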

A  physical  analogy  illustrates  the  idea  of  TDNCA.  Visualize  the  network  as 
marbles  (nodes)  connected  by  unit-length  string  segments  (arcs).  Lay  the  entire  network 
on  a  floor,  in  two  dimensions.  Select  a  set  of  top-level  marbles  that  form  a  clique.  Tape 
them  together  as  a  single  top-level  marble.  Pick  up  the  top-level  marbles  and  raise  them 
until  all  marbles  are  off  the  floor.  Then  lower  the  top-level  marbles  until  a  marble 
touches  the  floor.  Assign  the  marbles  on  the  floor  a  class  number  of  six.  Assign  the 
marbles  on  each  successive  higher  level  one  class  number  lower  than  the  level  below. 
When  the  top-level  marbles  are  assigned  a  class  number,  classification  is  complete. 

As  an  example,  select  node  (marble)  D  in  Figure  8  as  the  single  top-level  marble. 
Raise  marble  D  two  levels  (to  level  four).  Marbles  C,  E,  F  and  G  occupy  level  five  and 
marbles  A,  B  and  H  remain  on  the  floor  (level  six).  Return  all  marbles  to  the  floor  and 
select  marble  G  as  the  top-level  marble.  Marble  G  must  be  raised  three  levels  (to  level 
three)  to  classify  the  network.  These  are  the  results  tabulated  in  Figure  8. 
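
The marble procedure amounts to a breadth-first search started from all top-level nodes at once. The Python sketch below illustrates this under stated assumptions: the arc list `FIG8_ARCS` is not taken from Figure 8 itself but is a tree chosen to agree with the distances quoted in the text, and the function name `classify` is hypothetical. The sketch does not check that the chosen top-level nodes form a clique.

```python
from collections import deque

# Assumed arc list, chosen to agree with the distances quoted for Figure 8.
FIG8_ARCS = [("A", "C"), ("B", "C"), ("C", "D"), ("D", "E"),
             ("D", "F"), ("D", "G"), ("G", "H")]

def classify(arcs, top_nodes, floor_class=6):
    """Return {node: class number}, with the lowest marbles at floor_class."""
    adj = {}
    for u, v in arcs:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Multi-source BFS: number of arcs from each node to its nearest top node.
    depth = {t: 0 for t in top_nodes}
    queue = deque(top_nodes)
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in depth:
                depth[v] = depth[u] + 1
                queue.append(v)
    raise_by = max(depth.values())  # levels the top marbles are raised
    return {n: floor_class - raise_by + d for n, d in depth.items()}

print(classify(FIG8_ARCS, ["D"])["D"])  # 4: marble D is raised two levels
print(classify(FIG8_ARCS, ["G"])["G"])  # 3: marble G is raised three levels
```

With this assumed tree, the two calls reproduce the class numbers described for top-level nodes D and G.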

Still  considering  the  network  of  Figure  8,  the  minimum  number  of  levels  for  any 
feasible  solution  is  three.  It  is  possible  to  have  more  than  one  top-level  node  as  long  as 
all  top-level  nodes  form  a  clique,  but  this  does  not  produce  a  feasible  solution  with  two 
levels.  Selecting  node  C  along  with  node  D  raises  nodes  A  and  B  to  level  five,  but  node 
H  remains  at  level  six.  Selecting  node  G  along  with  node  D  raises  node  H  to  level  five, 
but  nodes  A  and  B  remain  at  level  six. 

TDNCA finds all sets of top-level nodes that result in acceptable solutions. In
the simple example above, the choice is obvious. For a large PSTN, the solution may not
be as obvious. TDNCA uses implicit enumeration: it limits potential top-level nodes to
the subset of nodes capable of being at the top level of an acceptable solution and then
finds all cliques in the set of potential top-level nodes. The following section establishes
the mathematical basis for finding potential top-level nodes.

C.  MATHEMATICAL  BASIS 

Six  properties  form  the  mathematical  basis  for  TDNCA.  Properties  1  and  2  bound 
the  number  of  levels  in  an  acceptable  solution.  Property  3  shows  that  multiple  top-level 
nodes  will  not  decrease  maxSP  by  more  than  one.  Property  4  establishes  the  relationship 
between  minSP  and  maxSP.  Properties  5  and  6  establish  achievable  lower  limits  for  the 
number  of  levels  in  a  feasible  solution. 

1.  Upper  Limit  on  Number  of  Levels 

Property 1: It is always possible to classify a network with minSP + 1 levels.

Proof: Let $k = \arg\min_{i \in N} \{mSP_i\}$. Then $cl_k = 6 - \mathit{minSP}$ and
$cl_i = cl_k + SP_{ki}, \ \forall i \in N$, is a feasible solution with minSP + 1 levels.
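
The construction in this proof can be tried on a small case. The following Python sketch applies it to a four-node path (the network of Figure 9); the matrix `PATH_SP` and the function name `property1_classification` are illustrative assumptions, not part of TDNCA.

```python
# All-pairs shortest path distances for the four-node path 1-2-3-4.
PATH_SP = {
    1: {1: 0, 2: 1, 3: 2, 4: 3},
    2: {1: 1, 2: 0, 3: 1, 4: 2},
    3: {1: 2, 2: 1, 3: 0, 4: 1},
    4: {1: 3, 2: 2, 3: 1, 4: 0},
}

def property1_classification(sp):
    """Classify with a single top-level node k that minimizes mSP_k."""
    msp = {i: max(row.values()) for i, row in sp.items()}  # mSP_i
    k = min(msp, key=msp.get)                              # arg min of mSP_i
    min_sp = msp[k]
    # cl_k = 6 - minSP and cl_i = cl_k + SP_ki for every node i.
    return {i: 6 - min_sp + sp[k][i] for i in sp}

classes = property1_classification(PATH_SP)
print(classes)                     # {1: 5, 2: 4, 3: 5, 4: 6}
print(len(set(classes.values())))  # 3 levels, that is minSP + 1
```

For this path minSP = 2, so the construction uses three levels; Property 6 shows that two levels are achievable here with two top-level nodes.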

2.  Lower  Limit  on  Number  of  Levels 

Property 2: It is not possible to classify a network with fewer than minSP levels.

Proof: Let TP be any set of top-level nodes in a feasible solution and let
$L = \max_{j \in N} \{\min_{k \in TP} \{SP_{kj}\}\}$. Then $cl_k = 6 - L, \ \forall k \in TP$ and
$cl_i = 6 - L + \min_{k \in TP} \{SP_{ki}\}, \ \forall i \in N \setminus TP$
is a feasible solution with L + 1 levels (the fewest number of levels for the set TP). Since
all top-level nodes form a clique and exactly one arc separates any two nodes in a clique,
for any $k \in TP$, $SP_{kj} - 1 \le \min_{l \in TP} \{SP_{lj}\}, \ \forall j \in N$. Therefore,
$\max_{j \in N} \{SP_{kj}\} - 1 \le \max_{j \in N} \{\min_{l \in TP} \{SP_{lj}\}\} = L$. Since
$\mathit{minSP} - 1 \le \max_{j \in N} \{SP_{kj}\} - 1 \le L$, $\mathit{minSP} \le L + 1$.




3.  Multiple  Top-Level  Nodes  Decrease  the  Number  of  Levels  by  at  Most 
One 

Let TP be a set of top-level nodes in a feasible solution and let $G' = (N', A')$ be a
revised network formed by combining all nodes in TP into a single node. Thus,
$N' = (N \setminus TP) \cup \{|N| + 1\}$ and
$A' = \{(i,j) \in A \mid i, j \notin TP\} \cup \{(|N| + 1, j) \mid (i,j) \in A, \ i \in TP, \ j \in N \setminus TP\}$.

Property 3: $SP_{ij}$ of $G'$ ($SP^{G'}_{ij}$) is greater than or equal to $SP_{ij}$ of $G$
($SP^{G}_{ij}$) minus 1 ($SP^{G'}_{ij} \ge SP^{G}_{ij} - 1$) for $(i, j) \in A'$.

Proof: All nodes in TP must form a clique. Therefore, $SP^{G}_{ij}$ can be at most one
greater than $SP^{G'}_{ij}$.


4.  Relationship  Between  minSP  and  maxSP 


Property 4: $\mathit{minSP} = \left\lceil \frac{\mathit{maxSP}}{2} \right\rceil$.

Proof: maxSP must be even or odd. For the case of maxSP odd, there exists a
node k such that $SP_{ik} + 1 = SP_{kj} = \left\lceil \frac{\mathit{maxSP}}{2} \right\rceil$,
where $SP_{ij} = \mathit{maxSP}$. Assume $SP_{kj} \ne \mathit{minSP}$.
This implies that either $\mathit{minSP} < SP_{kj}$ or $\mathit{minSP} > SP_{kj}$.

If $\mathit{minSP} < SP_{kj}$, then there exists a node $j' \ne k$ such that
$SP_{j'i'} < SP_{kj}, \ \forall i' \in N$. This implies that
$SP_{ij'} + SP_{j'j} < SP_{ik} + SP_{kj}$, since $SP_{j'j} < SP_{kj}$ and
$SP_{ij'} < SP_{kj} = SP_{ik} + 1$ ($SP_{ij'} \le SP_{ik}$). This is a contradiction that
$SP_{ij} = SP_{ik} + SP_{kj}$ is a shortest path.

If $\mathit{minSP} > SP_{kj}$, then there exists a node $j' \in N$ such that
$SP_{kj'} > SP_{kj}$, implying either $SP_{ij'} = SP_{ik} + SP_{kj'} > \mathit{maxSP}$ or
$SP_{jj'} = SP_{jk} + SP_{kj'} > \mathit{maxSP}$, a contradiction that $SP_{ij} = \mathit{maxSP}$.

For the case of maxSP even, there exists a node k such that
$SP_{ik} = SP_{kj} = \frac{\mathit{maxSP}}{2}$, where $SP_{ij} = \mathit{maxSP}$ and
$SP_{kj} = \mathit{minSP}$. The proof is identical to the odd case.

5.  A  Feasible  Solution  May  Exist  at  the  Lower  Limit 

Property 5: If maxSP is even, no feasible solution exists with minSP levels.

Proof: From the definitions of minSP and maxSP there exists a node k such that
$SP_{ik} = SP_{kj} = \frac{\mathit{maxSP}}{2}$, where $\mathit{maxSP} = SP_{ij}$. By Property 4,
$SP_{ik} = \mathit{minSP}$. By contradiction, assume we can form a solution with minSP levels.
This implies that $\min_{i' \in TP} \{SP_{ii'}\} \le \mathit{minSP} - 1$ and
$\min_{j' \in TP} \{SP_{jj'}\} \le \mathit{minSP} - 1$. Since
$\mathit{minSP} - 1 = \frac{SP_{ij}}{2} - 1$,
$\min_{i' \in TP} \{SP_{ii'}\} + \min_{j' \in TP} \{SP_{jj'}\} \le SP_{ij} - 2$,
contradicting Property 3 since this implies $SP^{G'}_{ij} \le SP^{G}_{ij} - 2$.

Property  6:  If  maxSP  is  odd,  the  minimum  number  of  levels  in  any  feasible 
solution  is  either  minSP  or  minSP  +  1 . 

Proof:  Consider  two  example  networks,  each  with  maxSP  odd.  In  the  first 
example  (Figure  9),  the  network  can  be  classified  with  two  levels  by  assigning  class 
number  five  to  nodes  2  and  3  and  class  number  six  to  nodes  1  and  4.  The  minimum 
number  of  levels  is  equal  to  minSP. 

(1) - (2) - (3) - (4)

Figure  9:  Example  network  with  maxSP  =  3,  minSP  =  2. 

In the second example (Figure 10), the network cannot be classified with minSP
levels. Table 1 enumerates all possible feasible solutions for this network. Note that
there are no cliques of three nodes in this network. The minimum number of levels is
three, or minSP + 1.


Figure  10:  Example  network  with  maxSP  =  3,  minSP  =  2.  The  all-pairs  shortest 
path  matrix  for  this  network  appears  to  the  right. 


                  Classification with top-level nodes

node      1    2    3    4    5    6    1,2  2,3  2,4  3,5  4,6  5,6
1         3    5    6    6    6    6    4    5    5    6    6    6
2         4    4    5    5    5    5    4    4    4    5    5    5
3         5    5    4    6    4    5    5    4    5    4    6    4
4         5    5    6    4    5    4    5    5    4    6    4    4
5         6    6    5    6    3    4    6    5    6    4    5    3
6         6    6    6    5    4    3    6    6    5    5    4    3
levels    4    3    3    3    4    4    3    3    3    3    3    4

Table  1:  All  possible  feasible  solutions  for  the  example  network  of  Figure  10. 
Each  column  is  a  feasible  solution.  Column  headings  indicate  the  top-level 
nodes  in  each  solution.  The  numbers  in  each  column  indicate  the  class 
number  of  the  node  in  that  row.  The  bottom  row  shows  the  number  of  levels 
in  each  solution. 
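
Table 1 can be checked by brute force. The Python sketch below is an illustration only: Figure 10 is not reproduced here, so the arc list `ARCS` is an assumption inferred from the two-node column headings of Table 1, and the helper names are hypothetical.

```python
from collections import deque
from itertools import combinations

# Arcs inferred from the two-node column headings of Table 1 (an assumption).
ARCS = [(1, 2), (2, 3), (2, 4), (3, 5), (4, 6), (5, 6)]
NODES = range(1, 7)

adj = {n: set() for n in NODES}
for u, v in ARCS:
    adj[u].add(v)
    adj[v].add(u)

def bfs(source):
    """Unit-length shortest path distances from source to every node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

sp = {n: bfs(n) for n in NODES}

def levels(top):
    """Number of levels when the clique `top` occupies the top level."""
    return 1 + max(min(sp[k][j] for k in top) for j in NODES)

# Cliques of one, two or three nodes; no three-node clique exists here.
cliques = [c for r in (1, 2, 3) for c in combinations(NODES, r)
           if all(b in adj[a] for a, b in combinations(c, 2))]
table = {c: levels(c) for c in cliques}
print(len(cliques))         # 12 columns, as in Table 1
print(min(table.values()))  # 3, that is minSP + 1
```

Under the assumed arcs, the twelve cliques and their level counts agree with the bottom row of Table 1.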


6.  Summary 


If  maxSP  is  even,  we  restrict  our  search  to  find  feasible  solutions  with  minSP  +1 
levels,  since  we  know  at  least  one  feasible  solution  with  minSP  +1  levels  exists. 


If maxSP is odd, we initially restrict our search to find feasible solutions with
minSP levels; one may exist. If no solutions are found with minSP levels, we look for
feasible solutions with minSP + 1 levels and are guaranteed to find at least one.

IV. ALGORITHM DESCRIPTION


TDNCA  finds  all  cliques  of  top-level  nodes  that  produce  acceptable  solutions 
(feasible  solutions  with  the  minimum  number  of  levels),  then  finds  the  acceptable 
solution  that  best  achieves  the  secondary  classification  objectives.  TDNCA  has  five 
major  steps.  The  first  step  calculates  the  all-pairs  shortest  path  matrix  and  determines  the 
minimum  number  of  levels  in  a  feasible  solution.  Step  two  finds  all  nodes  that  can  be  at 
the  top  level  in  an  acceptable  solution.  Step  three  finds  each  clique  of  top-level  nodes 
that  results  in  an  acceptable  solution.  Step  four  classifies  the  network  for  each  clique  of 
top-level  nodes  found  in  step  three,  and  step  five  finds  the  solution  from  this  group  that 
best  satisfies  the  secondary  classification  objectives. 

A.  FIND  SHORTEST  PATHS 

TDNCA finds the shortest path from each node to every other node ($SP_{ij}$) using a
repeated all-pairs shortest path algorithm. Since all arcs are of unit length, breadth-first
search (BFS) solves the shortest path problem for each node in the network. All
subsequent calculations use the all-pairs solution.

Below is the pseudo code for the repeated all-pairs algorithm, which has
complexity O(N · S(BFS)), where S(BFS) is the time required to solve BFS. The
complexity of BFS is O(A); the resulting complexity of all-pairs is O(N · A) (Ahuja et al.
1993, pp. 79, 145).

algorithm 1: all-pairs shortest path

Input: undirected connected network G = (N, A), all arcs of unit length;
Output: |N| x |N| matrix of the shortest path distances, SP_si, ∀ s, i ∈ N

{
    for each node s ∈ N {
        SP_si = ∞ ∀ i ∈ N;
        SP_ss = 0;
        put node s onto FIFO Queue;
        while (Queue not empty) {
            pop node i from Queue;
            for (each arc (i, j) ∈ A(i),
                 where A(i) is the set of arcs incident to node i) {
                if (SP_sj = ∞) {
                    SP_sj = SP_si + 1;
                    push node j onto Queue;
                }
            }
        }
        store SP_si ∀ i ∈ N;
    }
}
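
Algorithm 1 translates almost line for line into a runnable form. The following Python sketch is one possible rendering, not the thesis implementation (which is in Java); the function name and the dictionary-based matrix layout are assumptions made for illustration.

```python
from collections import deque

def all_pairs_shortest_paths(nodes, arcs):
    """Return {s: {i: SP_si}} for an undirected, connected, unit-length network."""
    adj = {n: [] for n in nodes}
    for i, j in arcs:
        adj[i].append(j)
        adj[j].append(i)
    sp = {}
    for s in nodes:
        dist = {s: 0}                  # nodes absent from dist are "infinity"
        queue = deque([s])             # FIFO queue
        while queue:
            i = queue.popleft()
            for j in adj[i]:
                if j not in dist:      # first visit gives the shortest distance
                    dist[j] = dist[i] + 1
                    queue.append(j)
        sp[s] = dist
    return sp

# Four-node path 1-2-3-4.
sp = all_pairs_shortest_paths([1, 2, 3, 4], [(1, 2), (2, 3), (3, 4)])
print(sp[1][4])  # 3
```

Each BFS touches every arc a constant number of times, which is where the O(N · A) bound comes from.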

B.  IDENTIFY  POTENTIAL  TOP-LEVEL  NODES 

TDNCA identifies a set $P \subseteq N$ of potential top-level nodes, containing only those
nodes that can be at the top level of an acceptable solution. Due to the clique constraint
for top-level nodes, adding additional nodes to the set of top-level nodes reduces the
distance from any node to the nearest top-level node by at most one (Property 3). Thus,
the set of potential top-level nodes is $P = \{i \in N \mid mSP_i = \mathit{minSP}\}$ for a
solution with minSP levels and
$P = \{i \in N \mid \mathit{minSP} \le mSP_i \le \mathit{minSP} + 1\}$ for a solution with
minSP + 1 levels. There are $2^{|P|} - 1$ possible combinations of top-level nodes.
Typically, $|P|$ is much smaller than $|N|$.

C. ENUMERATE CLIQUES OF TOP-LEVEL NODES

Define set $C$ as the set of all combinations of nodes in $P$. Each element $C_i \in C$,
$i = 1, 2, \ldots, |C|$, is a unique combination of nodes. Define set $C' \subseteq C$ as the
set of all cliques in $C$. Each element $C'_i \in C'$, $i = 1, 2, \ldots, |C'|$, is a unique
clique formed from the nodes in set $P$. If all nodes in $P$ form a clique, $C' = C$;
otherwise $C' \subset C$. TDNCA does not necessarily enumerate the set $C$; it implicitly
enumerates all possible combinations of the nodes in $P$, finding the set of cliques $C'$
and retaining only those cliques in $C'$ that result in acceptable solutions.

Below is the pseudo code for the enumeration of top-level node cliques. Define
ML as one less than the minimum achievable number of levels in a feasible solution. If
maxSP is even (Property 5) set ML = minSP (a solution with minSP + 1 levels). If maxSP
is odd (Property 6) initially set ML = minSP - 1 (a solution with minSP levels). Since we
are not guaranteed to find a solution with minSP levels, one is added to ML when
necessary.

algorithm 2: enumerate cliques of top-level nodes

Input: set of potential top nodes indexed as 1, 2, 3, ..., |P|,
       the all-pairs shortest path matrix and ML;
Output: the set of all feasible solutions with the minimum number of levels
        (solution pool)

{
    push 1 on LIFO Stack;
    while (Stack not empty) {
        do {
            if (node at top of Stack is connected to each node on Stack) {
                if (max_{j ∈ N} { min_{k ∈ Stack} {SP_jk} } = ML) {
                    store the Stack as a set of acceptable top nodes;
                }
                push (value on top of Stack + 1) on Stack;
            }
            else {
                pop Stack;
                if (value popped > |P|) {
                    pop Stack;
                }
                push (value popped + 1) on Stack;
            }
        } until (node at top of Stack > |P|)
        pop Stack;
        pop Stack;
        if (Stack not empty) {
            pop Stack;
            if (value popped < |P|) {
                push (value popped + 1) on Stack;
            }
        }
    }
    if no acceptable solution is found, ML = ML + 1 and repeat;
}
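
The two inputs that Algorithm 2 needs besides the shortest path matrix, ML and the set P, follow directly from the rules stated in Sections III.C and IV.B. The Python sketch below shows one way to derive them; the function name `algorithm2_inputs`, the `try_extra_level` flag and the dictionary layout of the matrix are assumptions made for illustration.

```python
def algorithm2_inputs(sp, try_extra_level=False):
    """Return (ML, P) from an all-pairs matrix given as {node: {node: distance}}."""
    msp = {i: max(row.values()) for i, row in sp.items()}  # mSP_i
    min_sp, max_sp = min(msp.values()), max(msp.values())
    if max_sp % 2 == 0 or try_extra_level:
        ml = min_sp        # look for solutions with minSP + 1 levels
    else:
        ml = min_sp - 1    # maxSP odd: first try minSP levels
    # A node can be at the top level only if mSP_i is at most ML + 1.
    p = sorted(i for i, e in msp.items() if e <= ml + 1)
    return ml, p

# Four-node path 1-2-3-4 (Figure 9): maxSP = 3, minSP = 2.
PATH_SP = {
    1: {1: 0, 2: 1, 3: 2, 4: 3},
    2: {1: 1, 2: 0, 3: 1, 4: 2},
    3: {1: 2, 2: 1, 3: 0, 4: 1},
    4: {1: 3, 2: 2, 3: 1, 4: 0},
}
print(algorithm2_inputs(PATH_SP))                        # (1, [2, 3])
print(algorithm2_inputs(PATH_SP, try_extra_level=True))  # (2, [1, 2, 3, 4])
```

The second call corresponds to the "ML = ML + 1 and repeat" step, where the set P widens to include nodes with mSP equal to minSP + 1.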


Figure 11 illustrates algorithm 2 using a rooted tree constructed from the nodes in
P. TDNCA explores all paths from the tree root and from each sub-root, if needed, to
find all elements of set C. In Figure 11, set C = {(1), (1, 3), (1, 3, 5), (1, 3, 5, 6),
(1, 3, 5, 6, 9), (1, 3, 5, 9), (1, 3, 6), (1, 3, 6, 9), (1, 3, 9), (1, 5), (1, 5, 6), (1, 5, 6, 9),
(1, 5, 9), (1, 6), (1, 6, 9), (1, 9), (3), (3, 5), ... , (9)}. Set C contains $2^5 - 1 = 31$
elements, all possible combinations of nodes {1, 3, 5, 6, 9}.


Figure 11: Rooted tree illustrating algorithm 2. In this example, the set P of potential
top-level nodes consists of nodes 1, 3, 5, 6, and 9. The root of the tree is the lowest
numbered node in set P. Construct the second level with the remaining nodes in P, each
connected to the root. Starting from each node in the second level, construct sub-trees
by filling the level below the root of each sub-tree with all nodes in set P numbered higher
than the root of the sub-tree. Continue until all leaf nodes have the highest node number
in set P. The arcs in the rooted tree do not indicate actual connections between nodes in
graph G. The all-pairs shortest path matrix to the right tabulates $SP_{ij}$ from the original
graph for each node in set P. Nodes i and j are connected if $SP_{ij} = 1$; otherwise they are
not connected.


Given set $C$, determining if each element $C_i$ forms a clique is straightforward but
computationally tedious. Algorithm 2 directly determines if each new element $C_i$ forms a
clique. If so, it continues. If not, it looks no farther along that branch of the rooted tree,
since adding more nodes to form additional cliques is fruitless. Algorithm 2 thus finds
the set $C'$ without explicitly finding each element of set $C$, significantly reducing
computation in most cases.


Figure 11 shows how to find set $C'$, starting at the root of the tree and proceeding
first down the left branch of the tree. Each added node in a path forms a new temporary
set $C_T$. If the nodes in $C_T$ are not a clique, discard $C_T$ and check no further combinations
containing set $C_T$. If the nodes in $C_T$ are a clique, store $C_T$ as the next clique $C'_i$,
$i = 1, 2, \ldots, |C'|$. The root node (node 1) is a clique and becomes $C'_1$. The next set of
nodes $C_T = (1, 3)$ is also a clique, so $C'_2 = C_T = (1, 3)$. Continuing down the left branch
of the tree, add node 5 to set $C_T$. Node 5 is connected to node 3, but not to node 1
($SP_{15} \ne 1$), so the set of nodes $C_T = (1, 3, 5)$ is not a clique. Adding additional nodes
to $C_T$ to form a clique is fruitless, so we reverse direction up the tree to the last node in
the previous clique, node 3. Assign to $C_T$ the contents of the previous clique. Continue
adding nodes to set $C_T$ moving down the adjacent untraversed branch, keeping to the left on
that branch. After adding each node to $C_T$, check if $C_T$ is a clique. If so, save $C_T$ as the
next clique in the sequence $C'_i$, $i = 1, 2, \ldots, |C'|$. If not, return to the last node in the
previous clique. After traversing all branches from the root, repeat for each of the sub-trees.
Subsequent cliques in the sequence are $C'_3 = (1, 3, 6)$, $C'_4 = (1, 3, 6, 9)$, $C'_5 = (1, 3, 9)$,
$C'_6 = (1, 6)$, $C'_7 = (1, 6, 9)$, $C'_8 = (1, 9)$, $C'_9 = (3)$, $C'_{10} = (3, 5)$,
$C'_{11} = (3, 5, 6)$, and so on. Set $C'$ contains 23 elements, the number of cliques in set $P$.
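
The traversal described above can be mimicked with a short recursive backtracking routine. This Python sketch is not the stack-based Algorithm 2; it is a simpler variant that visits cliques in the same depth-first order and prunes a branch as soon as a non-clique appears. The adjacency used here is an assumption: the text states only that nodes 1 and 5 are not connected, and the sketch assumes every other pair in P is connected, which is consistent with the listed cliques and the count of 23.

```python
def enumerate_cliques(p_nodes, adjacent):
    """Yield every clique of p_nodes in depth-first, lowest-number-first order."""
    p_nodes = sorted(p_nodes)

    def extend(clique, start):
        for idx in range(start, len(p_nodes)):
            v = p_nodes[idx]
            if all(adjacent(u, v) for u in clique):
                yield clique + (v,)
                yield from extend(clique + (v,), idx + 1)
            # Otherwise prune: no superset of clique + (v,) can be a clique.

    yield from extend((), 0)

# Assumption: every pair in P is adjacent except nodes 1 and 5.
NOT_ADJACENT = {frozenset((1, 5))}

def is_adjacent(u, v):
    return frozenset((u, v)) not in NOT_ADJACENT

cliques = list(enumerate_cliques([1, 3, 5, 6, 9], is_adjacent))
print(len(cliques))  # 23
print(cliques[:5])   # [(1,), (1, 3), (1, 3, 6), (1, 3, 6, 9), (1, 3, 9)]
```

Of the 31 combinations, the 8 that contain both node 1 and node 5 are never generated, which is the saving that implicit enumeration provides.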

While filling the set $C'$, TDNCA checks that each clique $C'_i \in C'$ results in an
acceptable solution. An acceptable solution for clique $C'_i$ exists if
$\max_{j \in N \setminus C'_i} \{\min_{k \in C'_i} \{SP_{jk}\}\} = ML$. If not, clique $C'_i$ is
discarded. To illustrate with Table 2, we calculate ML = 2. Consider two cliques of
top-level nodes, $C'_1 = (G)$ and $C'_2 = (C, D)$.

[Table 2 body: matrix of $SP_{ij}$ for nodes A through H with a column of $mSP_i$ values. Most
entries did not survive extraction; the surviving fragments include the distances from node H
to nodes A through H: 4, 4, 3, 2, 3, 3, 1, 0.]

Table 2: Example all-pairs shortest path matrix for the network shown in
Figure 8. For this network, maxSP = 4, minSP = 2 and ML = 2.


For clique $C'_1$, observe that $\max_{j \in N \setminus C'_1} \{\min_{k \in C'_1} \{SP_{jk}\}\} = 3$
arcs separate node G from the most distant node(s) in the network. We discard the clique
$C'_1$ as unacceptable since 3 > ML. For clique $C'_2$,
$\max_{j \in N \setminus C'_2} \{\min_{k \in C'_2} \{SP_{jk}\}\} = 2$, so at most two arcs separate
any node $j \in N \setminus C'_2$ from the nearer top-level node in clique $C'_2$. Since ML = 2, we
retain clique $C'_2$ as acceptable. The solution with top-level node G has four levels, while the
solution with top-level nodes C and D has only three levels and achieves the primary objective.
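
The acceptance test for these two cliques can be written out in a few lines. In the Python sketch below, the distance rows for nodes D and G follow the distances quoted in the text for Figure 8; the row for node C is an assumption consistent with those statements. The names `SP`, `farthest_distance` and `is_acceptable` are hypothetical.

```python
NODES = "ABCDEFGH"

# Distances from C, D and G to nodes A..H. Rows D and G follow the text;
# row C is an assumption consistent with the description of Figure 8.
SP = {
    "C": dict(zip(NODES, [1, 1, 0, 1, 2, 2, 2, 3])),
    "D": dict(zip(NODES, [2, 2, 1, 0, 1, 1, 1, 2])),
    "G": dict(zip(NODES, [3, 3, 2, 1, 2, 2, 0, 1])),
}

def farthest_distance(clique):
    """Largest distance from any other node to its nearest clique member."""
    others = [j for j in NODES if j not in clique]
    return max(min(SP[k][j] for k in clique) for j in others)

def is_acceptable(clique, ml):
    return farthest_distance(clique) == ml

ML = 2
print(farthest_distance(("G",)), is_acceptable(("G",), ML))          # 3 False
print(farthest_distance(("C", "D")), is_acceptable(("C", "D"), ML))  # 2 True
```

The number of levels in the resulting solution is the farthest distance plus one, which gives four levels for (G) and three for (C, D).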

D.  DEVELOP  SOLUTIONS 

After algorithm 2, each clique of top-level nodes remaining in set $C'$ produces an
acceptable solution. For each clique $C'_j$ in set $C'$, we classify the network by assigning
each node a class number $cl_k = 6 - ML + \min_{i \in C'_j} \{SP_{ik}\}, \ \forall k \in N$. This
satisfies the remaining classification rules, with each node classified exactly one level
below its parent. The resulting solutions form the solution pool.

E.  FIND  AN  OPTIMUM  SOLUTION 

All  solutions  in  the  solution  pool  satisfy  the  primary  classification  objective: 
classification  with  the  fewest  number  of  levels.  TDNCA  finds  the  solution  in  the  solution 
pool  that  best  achieves  the  secondary  classification  objectives.  This  thesis  explores  three 
methods  for  determining  the  “best”  solution.  Two  methods  fit  the  soft  inferences;  one 
uses  a  non-linear  penalty  function  and  the  other  counts  the  number  of  satisfied  soft 
inferences.  In  the  absence  of  soft  inferences,  the  third  method  maximizes  a  linear 
objective  function  similar  to  that  used  by  Olson. 

1.  Penalty  function 

Calculate a non-linear penalty for each node, a function of the node class and the
class suggested by the soft inferences. Expert knowledge rules determine a soft inference
for each node, expressed as the probability that the node is a transit exchange,
$0 \le SOFT_i \le 1, \ \forall i \in N$. The penalty for each node is

$$PEN_i = \begin{cases} (1 - SOFT_i)^2 & \text{if } cl_i \le 4 \\ SOFT_i^2 & \text{if } cl_i \ge 5 \end{cases}, \quad \forall i \in N.$$

The total penalty for each pool solution is $\sum_{i=1}^{|N|} PEN_i$. An optimal solution by this
method has the smallest total penalty.
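
A small numerical illustration may help. The Python sketch below evaluates the total penalty for a hypothetical two-node example; the function name `total_penalty` and the sample inputs are assumptions, with the SOFT values 0.95 and 0.45 borrowed from Table 4.

```python
def total_penalty(classes, soft):
    """Sum of PEN_i over all nodes, given class numbers and SOFT_i values."""
    total = 0.0
    for cl, s in zip(classes, soft):
        if cl <= 4:
            total += (1.0 - s) ** 2   # classed as a transit exchange
        else:
            total += s ** 2           # classed as a local exchange
    return total

soft = [0.95, 0.45]
print(round(total_penalty([4, 6], soft), 4))  # 0.205: classes agree with inferences
print(round(total_penalty([6, 4], soft), 4))  # 1.205: classes contradict inferences
```

Squaring makes a contradiction of a strong inference much more costly than a contradiction of a weak one.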


2.  Count  of  Soft  Inference  Matches 


Total the matches between node class and the class suggested by soft inferences.
For purposes of this method, node i is considered to be a transit exchange if $SOFT_i > 0.5$
and a local exchange if $SOFT_i < 0.5$. If $SOFT_i = 0.5$, any classification for the node
matches the soft inference, so we ignore this case. The total number of matches for a
solution becomes

$$\sum_{i=1}^{|N|} \begin{cases} \lfloor SOFT_i + 0.5 \rfloor & \text{if } cl_i \le 4 \\ 1 - \lfloor SOFT_i + 0.5 \rfloor & \text{if } cl_i \ge 5 \end{cases}$$

An optimal solution by this method has the greatest number of matches.
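
The match count can be sketched in the same style. In the Python example below, the function name `count_matches` and the sample inputs are assumptions; nodes with SOFT exactly 0.5 are skipped, following the statement in the text that this case is ignored.

```python
import math

def count_matches(classes, soft):
    """Count nodes whose class agrees with the rounded soft inference."""
    matches = 0
    for cl, s in zip(classes, soft):
        if s == 0.5:
            continue                          # no preference, so ignored
        suggests_transit = math.floor(s + 0.5)  # 1 if SOFT_i > 0.5, else 0
        if cl <= 4:
            matches += suggests_transit
        else:
            matches += 1 - suggests_transit
    return matches

# Node 1: transit class, strong transit inference  -> match
# Node 2: local class, weak local inference        -> match
# Node 3: local class, weak transit inference      -> no match
print(count_matches([4, 6, 5], [0.95, 0.45, 0.60]))  # 2
```

Unlike the penalty function, this criterion treats a weak inference and a strong inference identically once they fall on the same side of 0.5.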


3.  Linear  Objective  function 

In the absence of soft inferences, TDNCA maximizes the objective function
$ZWT \cdot zclass - TWT \cdot \sum_{i} top_i + PWT \cdot \sum_{i \mid cl_i = 5} 1$, with
ZWT = 1,000, TWT = 5 and PWT = 2. Olson (1998) uses similar weights. This objective
function is Olson's without his soft inference term.

V. COMPUTATIONAL RESULTS


A.  HARDWARE  AND  SOFTWARE 

TDNCA  is  implemented  in  Java  (Sun  Microsystems  1998).  Reported  results  are 
obtained  on  a  333  megahertz  Pentium  II  personal  computer  (PC),  operating  Windows  NT 
4.0  and  the  Microsoft  Virtual  Machine,  a  just-in-time  Java  compiler  provided  with  Code 
Warrior  3.0  (Metroworks  1998). 

B.  TEST  NETWORKS 

TDNCA  classifies  23  test  networks,  the  same  networks  used  by  Olson  (1998). 

Test  networks  include  nine  actual  or  modified  U.S.  regional  PSTNs,  seven  notional 
networks  constructed  to  test  model  robustness,  and  seven  networks  modified  from  U.S. 
regional  PSTNs.  Table  3  summarizes  all  test  networks. 

1.  Networks  Derived  from  U.S.  Regional  PSTNs 

Nine  test  networks  derive  from  U.S.  regional  PSTNs  obtained  from  open  sources. 
These networks typify regional PSTNs. Net-0 models an actual regional PSTN modified
to have a leafy structure (a structure with few meshes or hubs) and four classes.
Network-1 and Network-2 are variations of an actual network with a single central transit
exchange. Network-3 and Network-4 are also variations of an actual network with
multiple transit exchanges. Network-4 has an additional mesh attached to a transit
exchange. Network-5 and Network-6 are the most complex in this group, with multiple
transit exchanges and meshes. Balt and Tracy are actual regional networks. Soft
inferences, based on available technical specifications, exist for all networks in this group
except Net-0. Appendix A contains diagrams of the nine networks in this group.

2. Notional Networks


Olson  (1998)  constructs  seven  notional  networks  that  are  more  difficult  to 
classify.  Each  notional  network  combines  several  of  the  networks  derived  from  U.S. 
regional  PSTNs,  tied  together  with  additional  arcs  as  needed.  Notional  networks  are 
intended  to  represent  real-world  PSTNs,  but  on  a  larger  scale. 

3.  Modified  Networks 

All but one of the test networks described above have three switching levels. To
test classification of PSTNs with more than three switching levels, Olson constructs
seven additional modified networks, each identified by the name "Lop", followed by the
number of the original test network, then by a letter. The letter "a" indicates addition of
one node to increase maxSP by one; the letter "b" indicates addition of two nodes to
increase maxSP by two.

Network     Location             No. of   No. of   No. of      Max      maxSP   No. of
                                 Nodes    Arcs     Top-Level   Node             Levels in
                                                   Nodes       Degree           Network

Networks derived from U.S. regional PSTNs
Net-0       Notional             21       21       1           3        6       4
Network-1   Baltimore area       27       32       1           21       4       3
Network-2   Baltimore area       38       47       1           30       4       3
Network-3   Georgia              34       38       2           10       5       3
Network-4   Georgia              34       46       3           12       5       3
Network-5   D.C. and N. VA       34       79       2           17       4       3
Network-6   D.C. and N. VA       42       96       4           16       5       3
Tracy       California           90       90       2           31       5       3
Balt        Baltimore area       103      103      3           66       5       3

Large Notional Networks
5_6         Aggregation          76       183      6           21       5       3
4_6         Aggregation          110      247      9           24       5       3
3_6         Aggregation          144      320      12          27       5       3
HugeC       Aggregation          118      313      10          25       5       3
HugeB       Aggregation          152      402      12          27       5       3
HugeA       Aggregation          220      575      18          33       5       3
Huge        Aggregation          304      948      24          39       5       3

Networks with modified Longest-Shortest Paths
Lop4a       Modified Network-4   35       47       3           12       6       4
Lop4b       Modified Network-4   36       48       3           12       7       4
Lop5a       Modified Network-5   35       81       3           17       5       3
Lop5b       Modified Network-5   36       82       2           17       6       4
Lop6        Modified Network-6   41       95       3           16       5       3
Lop6a       Modified Network-6   43       97       3           16       6       4
Lop6b       Modified Network-6   44       98       3           16       7       4

Table 3: Summary of test networks, adapted from Olson (1998).


C.  SOFT  INFERENCE  VALUES 

Combining expert knowledge rules to produce soft inferences is not necessarily
straightforward. Each rule provides a separate indicator of the class of a switch. If
several rules independently indicate that a switch is a certain type, the combined
inference is stronger than any individual rule, but perhaps only slightly stronger than the
strongest single rule. An inference may also be incorrect. For example, a soft inference
may be based on strong indications that a switch is a transit exchange, while in fact the
switch is a local exchange. The existing GCAT AI algorithm provides individual rule
weights (Curet 1997). This thesis uses estimates of combined rule weights to provide
soft inferences, shown in Table 4. Soft inferences have been formulated for purposes of
testing our methods, and do not have any specific technical motivation.


Applicable Expert Knowledge Rules    SOFT_i
OCN, DCO                             0.43
OCN                                  0.45
OCN, DMS100                          0.47
DCO                                  0.48
DMS100                               0.52
CLLI, OCN, DCO                       0.55
NPACOC, OCN                          0.55
DMS250                               0.57
4ESS                                 0.60
CLLI, OCN                            0.60
NPACOC                               0.60
CLLI, DMS100                         0.67
CLLI                                 0.75
CLLI, NPACOC, OCN                    0.77
CLLI, NPACOC                         0.80
CLLI, OCN, DMS250                    0.84
CLLI, DMS250                         0.89
CLLI, 4ESS                           0.95

Table 4: Soft inference values for combinations of expert knowledge
rules. SOFT_i = the probability that node i is a transit exchange.

Table 5 summarizes soft inference values for two real-world PSTNs. Exact
values of soft inferences for each test network appear in Appendix B.

Inference Meaning            SOFT_i       Network Tracy Nodes                Network Balt Nodes

weak suggestion the node     < 0.5        0, 2, 6, 8, 11, 28, 29, 32,        16, 27, 48, 86, 89, 92
is not a transit exchange                 34, 40, 43, 46, 47, 52, 55,
                                          57, 61, 668, 71, 72, 79,
                                          88, 89

weak suggestion the node     0.5 to 0.7   12, 19, 43, 63, 67, 76,            2, 7, 14, 23, 52, 53, 54
is a transit exchange                     84, 87

strong suggestion the node   > 0.7        20, 58, 80, 81                     18, 48, 68, 84, 87, 102
is a transit exchange

Table 5: Soft inferences grouped into categories for two real-world PSTNs.
Nodes with strong inferences are most likely to be classified as top-level nodes
when considering soft inferences. Weak inferences have less influence on
solutions.


D.  TEST  RESULTS 

TDNCA  classifies  most  test  networks  in  less  than  one  second  and  never  needs 
more  than  67  seconds.  For  testing  purposes,  TDNCA  calculates  three  different  objective 
functions  for  each  network.  Table  6  summarizes  computational  performance  for  all  test 
networks. 

The slower classification of network Huge results from the large number of
potential top-level nodes. The number of nodes, $|N|$, in network Huge is 304. There are
$2^{304} - 1$ possible combinations of these nodes. However, of these 304 nodes, only
$|P| = 24$ nodes can be at the top level of an acceptable solution. All 24 of these nodes
form a clique, so there are $|C| = |C'| = 2^{24} - 1$ or 16,777,215 cliques of top-level nodes.
TDNCA completely enumerates these cliques to find 20,736 that result in acceptable solutions.
The best solution is chosen from among this group.

Network     Number of   Number of   Number of feasible   Number of      Time To
            Nodes       Arcs        top-level            solutions in   Classify
            |N|         |A|         combinations         pool           (secs)
                                    of nodes             |C'|

Net-0       21          21          7                    4              0.3
Network-1   27          32          35                   18             0.3
Network-2   38          47          55                   28             0.3
Network-3   34          38          3                    1              0.3
Network-4   34          46          7                    1              0.3
Network-5   34          79          277                  139            0.4
Network-6   42          96          19                   8              0.3
Tracy       90          90          7                    2              0.3
Balt        103         103         7                    2              0.3
5_6         76          183         63                   12             0.3
4_6         110         247         511                  12             0.3
3_6         144         320         4095                 12             0.3
HugeC       118         313         1023                 72             0.4
HugeB       152         402         4095                 144            0.4
HugeA       220         575         262,143              144            1.1
Huge        304         948         16,777,215           20,736         67
Lop4a       35          47          37                   19             [missing]
Lop4b       36          48          7                    7              0.3
Lop5a       35          81          263                  98             [missing]
Lop5b       36          82          279                  205            [missing]
Lop6        41          95          289                  212            0.4
Lop6a       43          97          282                  206            0.4
Lop6b       44          98          19                   8              0.3

Table 6: Summary of classification performance for all test networks. Calculation of the
all-pairs shortest path matrix requires less than 0.1 seconds in all cases. Enumeration of
feasible top-level sets requires less than 0.7 seconds for all networks except Huge, which
requires 46 seconds. Time to classify includes input, output and all calculations. The
test machine is a 333 megahertz Pentium II PC.
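
When every node in P is connected to every other, each non-empty subset of P is a clique, so the number of combinations to examine is $2^{|P|} - 1$. The short Python check below reproduces the count quoted for network Huge and shows that several other combination counts in Table 6 have the same form; the function name is hypothetical, and reading those other entries as cases where P forms a clique is an interpretation rather than a statement from the source.

```python
def nonempty_subsets(n):
    """Number of non-empty combinations of n nodes."""
    return 2 ** n - 1

for n in (3, 6, 9, 10, 12, 18, 24):
    print(n, f"{nonempty_subsets(n):,}")
# The last line printed is: 24 16,777,215
```

The count doubles with each additional potential top-level node, which explains why network Huge dominates the running times.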

The  solution  pool  contains  all  solutions  that  achieve  the  primary  classification 
objective:  classification  with  the  minimum  number  of  levels.  This  thesis  ranks  solutions 
from  the  solution  pool  using  three  different  optimality  criteria  (Table  7).  In  all  cases,  at 
least  one  of  the  optimality  criteria  identifies  the  solution  for  each  network  that  correctly 
classifies  the  network.  In  a  correct  classification,  each  assigned  node  class  matches  the 
actual  class  of  the  node  in  the  modeled  PSTN.  In  all  but  one  case  (network  Balt),  the 
linear  objective  function  correctly  classifies  the  network. 
                 Optimal Sets of Top-Level Nodes

Network      Maximum               Minimum Penalty       Highest Count of Soft
             Objective Function    Function              Inference Matches
----------   ------------------    ------------------    ---------------------
Net-0        13                    no soft inferences    no soft inferences
Network-1    14                    14                    3, 14
Network-2    14                    14, 31                3, 14
Network-3    9, 28                 9, 28                 9, 28
Network-4    7, 9, 28              7, 9, 28              7, 9, 28
Network-5    21, 31                0, 31                 21, 31
                                   25, 31                21, 24, 31
Network-6    6, 15, 19, 37         6, 15, 19, 37         6, 15, 19, 37
Tracy        20, 81                20, 58, 81            20, 58, 81
Balt         68, 84                68, 84, 87            68, 84, 87

Table  7:  Optimal  top-level  node  sets  for  test  networks  derived  from  actual  U.S. 
PSTNs.  The  set  of  top-level  nodes  shown  in  each  block  best  achieves  the 
indicated  objective.  Split  blocks  indicate  multiple  optimal  solutions.  Sets  in  bold 
numbers  exactly  match  the  true  configuration  of  the  test  network.  Sets  in  italics 
fail  to  classify  one  actual  top-level  node,  classify  an  additional  top-level  node,  or 
both. 

Soft  inferences  influence  the  selection  of  optimal  solutions.  In  the  following 
paragraphs,  we  examine  the  classification  results  for  networks  Tracy  and  Balt  and 
illustrate  the  influence  of  soft  inferences. 


Classification  of  network  Balt  using  the  maximum  objective  function  criterion 
identifies two top-level nodes, nodes 68 and 84. The actual Balt network has three
top-level nodes, nodes 68, 84 and 87. For network Balt, ML = 2, maxSP = 5, and minSP = 3.
Observe that node 87 has only three adjacent nodes (see diagram in Appendix A),
mSP_87 = 3, and nodes 68, 84 and 87 form a clique. The objective function minimizes the
number  of  top-level  nodes,  resulting  in  exclusion  of  node  87  from  the  top  level.  Nodes 
68,  84  and  87  all  have  strong  inferences  that  they  are  transit  exchanges  (Table  5). 
Optimality  functions  that  include  soft  inferences  benefit  from  placing  node  87  at  the  top 
level,  since  a  strong  soft  inference  is  satisfied.  The  soft  inferences  in  the  case  of  network 
Balt  contribute  to  a  correct  solution. 


Classification of network Tracy using the maximum objective function criterion
identifies the correct solution, but classifications using soft inferences are incorrect.
Node 58 in network Tracy is only adjacent to nodes 20 and 81 (see diagram in Appendix
A). The objective function excludes node 58 from the top level to minimize the number
of top-level nodes. The soft inference for node 58 strongly suggests a transit exchange,
so the optimality functions that include soft inferences place node 58 at the top level.
The actual Tracy network has only two top-level nodes.

A  single  node  with  few  incident  arcs  and  a  strong  soft  inference  leads  to  the 
classification  errors  in  networks  Balt  and  Tracy.  For  all  test  networks,  the  solution  pool 
contains  the  correct  solution  and  at  least  one  of  the  optimality  criteria  identifies  the 
correct  solution. 
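
The trade-off described above can be illustrated with a small sketch that compares two candidate top-level sets under two criteria: fewest top-level nodes, and most soft-inference matches. Only the node numbers come from the Tracy discussion. The p(top) values are hypothetical, and both scoring rules are simplified stand-ins, not the objective and penalty functions used in this thesis.

```python
def soft_inference_matches(top_level, p_top):
    """Count nodes whose soft inference agrees with a candidate top-level set.

    A node matches when p(top) > 0.5 and it is in the top level, or when
    p(top) < 0.5 and it is not.
    """
    matches = 0
    for node, p in p_top.items():
        if p > 0.5 and node in top_level:
            matches += 1
        elif p < 0.5 and node not in top_level:
            matches += 1
    return matches


# Hypothetical soft inferences; node 5 is an invented low-level node.
p_top = {20: 0.9, 58: 0.9, 81: 0.9, 5: 0.4}
pool = [frozenset({20, 81}), frozenset({20, 58, 81})]

fewest_nodes = min(pool, key=len)
most_matches = max(pool, key=lambda s: soft_inference_matches(s, p_top))
print(sorted(fewest_nodes))  # [20, 81]
print(sorted(most_matches))  # [20, 58, 81]
```

With these assumed values, a strong soft inference on one sparsely connected node is enough to make the two criteria disagree, which is the pattern seen for both Balt and Tracy.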
VI.  CONCLUSIONS  AND  RECOMMENDATIONS 


A.  CONCLUSIONS 

TDNCA  quickly  classifies  all  test  networks,  including  two  networks  that  model 
real-world  PSTNs.  Solutions  are  obtained  in  under  one  second  for  nine  real-world 
PSTNs,  and  large  notional  networks  of  over  300  nodes  and  900  arcs  are  classified  in 
under  one  minute.  The  solution  speed  of  TDNCA  is  due  to  custom  algorithms  written  for 
the  purpose  of  node  classification  and  implemented  in  Java.  TDNCA  outperforms 
existing  node  classification  software  and  appears  fast  enough  to  allow  classification  of 
any  real-world  PSTN  in  less  than  one  minute. 

B.  COMPARISON  TO  PREVIOUS  RESEARCH 

TDNCA  has  several  advantages  over  previous  efforts: 

1) TDNCA is written in Java, using well-known graph-theoretic techniques and
simple programming. It does not require commercial software packages or
proprietary algorithms. TDNCA can be used as written in Java or can be easily re-coded
in C or C++ and integrated into GCAT.

2)  TDNCA  classifies  test  networks  more  quickly  than  previous  methods  and  with 
equal  accuracy.  We  obtain  the  same  results  as  Olson  for  all  test  networks  when  using 
Olson’s  objective  function. 

3)  TDNCA  can  evaluate  multiple  optimal  or  nearly  optimal  solutions  in  the 
neighborhood  of  the  optimal  solution.  Once  TDNCA  finds  the  solution  pool,  solutions 
can  be  rank-ordered  using  any  desired  fitness  function.  Fitness  functions  need  not  be 
linear. 
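
As a minimal sketch of the third point, the following Python fragment rank-orders a small solution pool with two interchangeable fitness functions, one linear and one not. Everything here is assumed for illustration: the pool, the convention that level 1 is the top level, the target of two top-level nodes, and the function names. None of it is taken from the TDNCA code.

```python
def rank_pool(pool, fitness):
    """Return the solutions sorted from highest to lowest fitness."""
    return sorted(pool, key=fitness, reverse=True)


def count_top(solution):
    """Number of nodes assigned to level 1 (assumed here to be the top)."""
    return sum(1 for level in solution.values() if level == 1)


def linear_fitness(solution):
    # Linear: every additional top-level node costs one unit.
    return -count_top(solution)


def nonlinear_fitness(solution, target=2):
    # Nonlinear: quadratic penalty for deviating from a hypothetical
    # preferred number of top-level nodes.
    return -(count_top(solution) - target) ** 2


# Hypothetical pool; each solution maps a node name to a switching level.
pool = [
    {"a": 1, "b": 2, "c": 2, "d": 2},
    {"a": 1, "b": 1, "c": 2, "d": 2},
    {"a": 1, "b": 1, "c": 1, "d": 2},
]

print(count_top(rank_pool(pool, linear_fitness)[0]))     # 1
print(count_top(rank_pool(pool, nonlinear_fitness)[0]))  # 2
```

Because ranking is separated from pool generation, swapping the fitness function does not require recomputing the pool.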


The advantage of the MIP used by Olson (1998) is a flexible formulation. Olson
does not attempt to classify a network with a minimum number of levels, but evaluates
solutions solely based on a numerical objective. Typical MIP objective function weights
result in classification with a minimum number of levels, but varying the weights can
produce solutions with more than the minimum achievable number of levels.

C.  RECOMMENDATIONS  FOR  FURTHER  RESEARCH 

TDNCA strictly achieves the primary objective, classification of networks with
the minimum number of levels. If the MIP's flexibility to classify networks with more
than the minimum number of levels is desired, TDNCA can be modified to do so.

Further  research  is  required  to  numerically  define  the  “best”  classification 
criterion.  This  thesis  explores  three  methods,  but  no  single  method  correctly  classifies  all 
test  networks.  Use  of  expert  knowledge  rules  to  determine  soft  inferences  has  promise, 
but  soft  inferences  may  lead  to  incorrect  classification.  Consistently  classifying  PSTNs 
with  no  errors  may  require  a  more  sophisticated  application  of  soft  inferences. 

Soft  inferences  used  in  this  thesis  indicate  the  probability  that  a  node  is  a  transit 
exchange  (class  three  or  class  four).  TDNCA  can  easily  be  modified  to  use  more  detailed 
soft  inferences.  For  example,  a  node  might  have  a  probability  of  0.2  of  being  class  four 
or  higher,  a  probability  of  0.7  of  being  class  five  and  probability  of  0.1  of  being  class  six. 
More  detailed  soft  inferences  such  as  these  may  help  to  correctly  classify  real-world 
PSTNs. 
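
One way such detailed soft inferences might be used is to score each candidate classification by the log-likelihood of its class assignments. The sketch below is an assumption, not part of TDNCA. Node "x" uses the probabilities quoted above, node "y" is hypothetical, and "class four or higher" is read as a class number of four or less, since transit exchanges are classes three and four.

```python
import math

# Per-node soft inference: probability of each class bucket.
soft = {
    "x": {"4_or_higher": 0.2, "5": 0.7, "6": 0.1},  # values from the text
    "y": {"4_or_higher": 0.6, "5": 0.3, "6": 0.1},  # hypothetical node
}


def bucket(node_class):
    """Map a numeric class to the bucket names used in `soft`."""
    return "4_or_higher" if node_class <= 4 else str(node_class)


def log_likelihood(classification, soft):
    """Sum of log-probabilities of the assigned classes.

    Nodes without a soft inference contribute nothing to the score.
    """
    return sum(
        math.log(soft[node][bucket(node_class)])
        for node, node_class in classification.items()
        if node in soft
    )


candidate_a = {"x": 5, "y": 4}
candidate_b = {"x": 4, "y": 4}
print(log_likelihood(candidate_a, soft) > log_likelihood(candidate_b, soft))  # True
```

Candidate A scores higher because it places node "x" in class five, its most probable class, while candidate B uses the 0.2 bucket.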
APPENDIX  A.  TEST  NETWORKS 

This  appendix  contains  diagrams  of  each  of  the  test  networks,  from  Olson  (1998). 
APPENDIX  B.  SOFT  INFERENCES 


This  appendix  lists  the  soft  inferences  generated  for  each  of  the  test  networks. 
Net-0  has  no  soft  inferences.  p(top)  is  the  probability  that  a  node  is  a  transit  exchange.  If 
p(top)  >  0.5,  the  node  is  most  likely  a  transit  exchange.  If  p(top)  <  0.5,  the  node  is  most 
likely  a  local  exchange. 
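
The threshold rule above can be written as a short function. This is a sketch: the function name is hypothetical, and the handling of p(top) exactly equal to 0.5, which the rule does not cover, is an assumption. The three sample values are those listed for nodes 6, 12 and 19 of Network-6 in this appendix.

```python
def likely_role(p_top):
    """Apply the 0.5 threshold to a soft inference value."""
    if p_top > 0.5:
        return "transit exchange"
    if p_top < 0.5:
        return "local exchange"
    return "undetermined"  # assumption: the rule is silent on exactly 0.5


# Three p(top) values as listed for Network-6.
network_6_sample = {6: 0.89, 12: 0.43, 19: 0.67}
for node, p in sorted(network_6_sample.items()):
    print(node, likely_role(p))
# 6 transit exchange
# 12 local exchange
# 19 transit exchange
```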


Network-2 Soft Inferences

node   p(top)
0      0.48
1      0.48
2      0.43
5      0.57
6      0.48
7      0.48
9      0.48
14     0.89
16     0.48
17     0.43
18     0.43
19     0.48
20     0.48
22     0.48
26     0.48
27     0.55
30     0.43
31     0.55
33     0.57
34     0.57

Network-3 Soft Inferences

node   p(top)
0      0.52
1      0.48
2      0.43
3      0.45
4      0.45
5      0.45
7      0.52
8      0.43
9      0.84
10     0.47
11     0.52
12     0.45
13     0.43
14     0.43
15     0.52
16     0.45
17     0.45
18     0.45
19     0.47
20     0.43
21     0.43
22     0.45
23     0.45
24     0.43
25     0.45
27     0.43
28     0.89
29     0.45
31     0.45
32     0.57
33     0.45

Network-1 Soft Inferences

node   p(top)
0      0.48
1      0.48
2      0.43
5      0.57
6      0.48
7      0.48
9      0.48
14     0.89
16     0.48
17     0.43
18     0.43
19     0.48
20     0.48
22     0.48
26     0.48



Network-4 Soft Inferences

node   p(top)
0      [missing]
1      [missing]
3      0.45
4      0.45
5      0.45
7      0.52
8      0.43
9      0.85
10     0.47
11     0.52
13     0.43
14     0.43
15     0.52
19     0.47
20     0.43
21     0.43
22     0.45
23     0.45
24     0.43
25     0.45
27     0.43
28     0.89
29     0.45
31     0.45
32     0.57
33     0.45

Network-5 Soft Inferences

node   p(top)
9      [missing]
13     [missing]
15     0.45
25     0.95
31     0.95

Network-6 Soft Inferences

node        p(top)
[missing]   0.89
1           0.52
3           0.52
4           0.52
6           0.89
12          0.43
19          0.67
21          0.57
22          0.52
23          0.52
31          0.43
33          0.45
37          0.60
38          0.45
39          0.43
40          0.43
41          0.43
